ABSTRACT 

This paper presents some Bayesian theories of simultaneous optimization 
of decision rules for test-based decisions. Simultaneous decision making 
arises when an institution has to make a series of selection, placement, 
or mastery decisions with respect to 
subjects from a population. An obvious example is the use of 
individualized instruction in education. Compared with separate 
optimization, a simultaneous approach has two advantages. First, test 
scores used in previous decisions can be used as "prior" data in 
later decisions, and the efficiency of the decisions can be 
increased. Second, more realistic utility structures can be obtained 
by defining utility functions for earlier decisions on later criteria. 
An important distinction is made between weak and strong decision 
rules. As opposed to strong rules, weak rules are allowed to be a 
function of prior test scores. Conditions for monotonicity of optimal 
weak and strong rules are presented. Also, it is shown that under 
mild conditions on the test score distributions and utility 
functions, weak rules are always compensatory by nature. To 
illustrate this approach, a common decision problem in education and 
psychology, consisting of a selection decision for treatment followed 
by a mastery decision, is analyzed. (Contains 1 figure, 2 tables, and 
23 references.) (Author) 

Abstract 



This paper presents some Bayesian theory for simultaneous optimization of decision 
rules for test-based decisions. Simultaneous decision making arises when an 
institution has to make a series of selection, placement, or mastery decisions with 
respect to subjects from a population. An obvious example is the use of 
individualized instruction in education. Compared with separate optimization, a 
simultaneous approach has two advantages. First, test scores used in previous 
decisions can be used as "prior data" in later decisions, and the efficiency of the 
decisions can be increased. Second, more realistic utility structures can be obtained 
by defining utility functions for earlier decisions on later criteria. An important 
distinction is made between weak and strong decision rules. As opposed to strong 
rules, weak rules are allowed to be a function of prior test scores. Conditions for 
monotonicity of optimal weak and strong rules will be presented. Also, it will be 
shown that under mild conditions on the test score distributions and utility functions, 
weak rules are always compensatory by nature. To illustrate the approach, a 
common decision problem in education and psychology, consisting of a selection 
decision for a treatment followed by a mastery decision, is analyzed. 

Introduction 

Over the past two decades, Bayesian decision theory has proven to be very 
useful in solving problems of test-based decision making. Historically, the first 
decision making problem to draw the interest of psychometricians was the selection 
problem in education and personnel management. Important milestones in the history 
of the treatment of selection decisions were the publication of the Taylor-Russell 
(1939) tables and Cronbach and Gleser's (1956) Psychological tests and personnel 
decisions. However, in spite of some of the theoretical notions in the latter, it was 
not until after an extensive discussion on "culture-fair" selection (Gross & Su, 1975) that 
selection decisions were fully treated as an instance of Bayesian decision theory 
(Novick & Petersen, 1976). 

With the advance of such modern instructional systems as individualized 
study systems, mastery learning, and computer-aided instruction (CAI), interest was 
generated in the possibility of putting the problem of mastery testing on a sound 
decision-theoretic footing. In mastery testing, the intent is to classify examinees as "masters" 
or "nonmasters" on the basis of their test scores, using some standard of mastery set 
on the true-score scale underlying the test scores. Hambleton and Novick (1973) 
were the first to point out the possibility of applying Bayesian decision theory to 
mastery testing. Optimal mastery rules for various utility or loss functions are 
derived in Davis, Hickman and Novick (1973), Huynh (1976, 1977, 1980) and van 
der Linden and Mellenbergh (1977). 

Interest in decision making problems in modern instructional systems has 
also led to the consideration of two other types of decision making: placement and 
classification decisions. In either type of decision making, test scores are used to 
assign examinees to one of the instructional treatments available. However, with 
placement decisions the success of each of the treatments is measured by the same 
criterion whereas in classification decisions each treatment involves a different 
criterion. The paradigm underlying placement decisions is the Aptitude-Treatment 
Interaction (ATI) hypothesis, which assumes that students may react differentially 
to instructional treatments, and, therefore, that different treatments may be best for 
different students. Classification decisions are made if an instructional program has 
different tracks each characterized by different instructional objectives. Such tracking 
can be found in systems of comprehensive secondary education or vocational 
education. Bayesian decision theory for placement and classification decisions is 
given in Sawyer (1993) and van der Linden (1981, 1987). 

Typically, instructional systems such as CAI do not involve one single decision 
but can be conceived of as networks of nodes at which one of the types of decisions 
above has to be made (van der Linden, 1990; Vos, 1990, 1991, 1993). An example 
is an instructional network starting with a selection decision, followed by several 
alternative instructional modules through which students are guided by making 
placement and mastery decisions, and which ends with a summative mastery test. 
Decisions in CAI networks are usually based on small tests (which often consist of 
only a few multiple-choice items). 

This raises the question of how such networks of decisions should be optimized. 
An obvious approach is to address each decision separately, optimizing its decision 
rule on the basis of test data exclusively gathered for this individual decision. This 
approach is common in current design of instructional systems. The purpose of the 
present paper is to show that multiple decisions in networks can also be optimized 
simultaneously. The advantages of a simultaneous approach are twofold. First, data 
gathered earlier in the network can be used to optimize later decisions. The use of 
such prior information can be expected to enhance the quality of the decisions, in 
particular if only small tests or sets of multiple-choice items are administered at the 
individual decision points. Second, a more realistic definition of utility or loss 
functions is possible, since these functions can now be defined on the ultimate 
success criterion in the complete network instead of on intermediate criteria 
measuring the success on individual treatments. In this paper, a simple decision 
network of a selection decision followed by one treatment and a mastery decision 
will be used to make our point. First, the selection-mastery problem will be 
formalized. Then important distinctions will be made between weak and strong as 
well as monotone and nonmonotone decision rules. Next, a theorem will be given 
showing under what conditions optimal rules will be monotone. Finally, results from 
an empirical example will be presented to illustrate the differences between a 
simultaneous and a separate approach. 



The Selection-Mastery Problem 

A flowchart of the selection-mastery problem is given in Figure 1. An 
example of the problem is an instructional module with a pretest and a posttest. 



Figure 1 about here 



The pretest is administered to select students for the module. It is assumed that the 
possible actions are to admit or to reject the student for the module. The posttest is 
used to decide whether or not the students have mastered the objectives of the 
module. Typically, the posttest is an unreliable representation of the objectives, and 
the criterion is supposed to be a threshold on the true score underlying the test. The 
possible actions are to classify a student as a master or a nonmaster. 

For a randomly sampled student, let the observed scores on the selection 
and mastery tests be continuous random variables denoted by X and Y, with 
realizations x and y, respectively. Also, it is assumed that, due to measurement error 
in the mastery test, the criterion to be considered is the classical test theory true 
score underlying the mastery test. Let the true score for a randomly sampled 
individual be denoted by a continuous random variable T with realization t. 

Further, it will be assumed that the relation between X, Y, and T can be 
represented by a joint density function f(x,y,t). It is important to note that the best 
experiment to estimate the parameters in this density function is the one in which 
a sample of examinees from the full marginal distribution of X is admitted to the 
treatment and the performances of these students on the mastery test Y are 
measured. Though it is possible to estimate the parameters from a distribution of X 
truncated by the fact that low performing students are not admitted to the treatment, 
such estimates need a parametric model for the density, which might be wrong 
and/or poorly estimated. 

Finally, it is assumed that the standard denoting true mastery is a threshold 
value $t_c$ on T. 

Simultaneous Decision Rules 

Let each of the possible actions be denoted by $a_{ij}$ ($i,j = 0,1$), where $i = 0,1$ 
stand for the actions of rejecting and accepting a student and $j = 0,1$ for the actions 
of retaining and advancing an accepted student. Since for a rejected student no 
further mastery decisions are made, the index j will be dropped for $i = 0$. 

Generally, a decision rule specifies for each possible realization (x,y) of 
(X,Y) which action $a_{ij}$ is to be taken. 

Weak and Strong Rules 

The decision rule for the mastery decision may or may not depend on the 
score X on the selection test. Intuitively, one can imagine that the fact that a student 
has delivered a high performance on the selection test leads to a more lenient rule 
for the mastery decision because this prior information implies that a possible low 
score on the mastery test is more likely due to measurement error than to a true low 
performance. Simultaneous rules in which decisions are a function both of the 
current test score and previous test scores will be called weak rules in this 
paper. As a general result, it will be proven that under obvious conditions weak rules 
will necessarily have a compensatory nature. The title of the paper already alludes 
to this result. 

If decisions are only a function of current test scores, the rules will be 
called strong (simultaneous) rules. 

For the decision network of Figure 1 a weak simultaneous rule $\delta$ can be 
defined as: 

$$
\begin{aligned}
\{(x,y) : \delta(x,y) = a_0\} &= A \times \mathbb{R} \\
\{(x,y) : \delta(x,y) = a_{10}\} &= A^c \times B(x) \\
\{(x,y) : \delta(x,y) = a_{11}\} &= A^c \times B^c(x),
\end{aligned}
\tag{1}
$$

where $A$, $A^c$, $B(x)$, and $B^c(x)$ stand for, respectively, the sets of x and y values for 
which a random student is rejected or admitted for a treatment and failed or passed 
the mastery test. $\mathbb{R}$ represents the set of real numbers. 

With strong rules, the sets $B(x)$ and $B^c(x)$ are independent of x. Strong 
simultaneous rules can only be optimal if certain conditions are met. These 
conditions will be given below. 



Monotone and Nonmonotone Rules 

Decision rules can take a monotone or a nonmonotone form. A decision 
rule is monotone if cutting scores are used to partition the sample space into regions 
for which different actions are taken. For example, a (separate) rule for the selection 
decision is monotone if there exists a cutting score $x_c$ such that all examinees with 
$X \geq x_c$ are admitted and those with $X < x_c$ are rejected. All other possible rules are 
nonmonotone. 

For our decision problem, a weak monotone rule $\delta$ can be defined as: 

$$
\delta(X,Y) =
\begin{cases}
a_0 & \text{for } X < x_c \\
a_{10} & \text{for } X \geq x_c,\ Y < y_c(x) \\
a_{11} & \text{for } X \geq x_c,\ Y \geq y_c(x),
\end{cases}
\tag{2}
$$

with $y_c(x)$ being the cutting score on Y. The fact that this cutting score is written 
as a mathematical function of x will be justified below by proving that $y_c(x)$ is unique 
for each value of x under reasonable assumptions. 

In this paper, the interest will mainly be in monotone rules. The reason for 
this choice is the fact that the use of cutting scores is common practice in 
educational and psychological testing, and that rules with a different form are 
frequently not acceptable. However, the restriction to monotone rules is correct only 
if it can be proven that for any nonmonotone rule for the problem at hand there is 
a monotone rule with at least the same value on the criterion of optimality used; that 
is, if the subclass of monotone rules is essentially complete (Ferguson, 1967, p. 55). 
Conditions under which the subclass of monotone (simultaneous) rules is essentially 
complete for the present problem will also be given below. 
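
The weak monotone rule in (2) can be sketched in a few lines of Python. This is an illustration only: the function name `weak_monotone_rule`, the cutting score `X_C`, and both cutting-score functions are hypothetical and are not taken from the paper or its data.

```python
from typing import Callable

def weak_monotone_rule(x: float, y: float, x_c: float,
                       y_c: Callable[[float], float]) -> str:
    """Return the action prescribed by rule (2): 'a0', 'a10', or 'a11'."""
    if x < x_c:
        return "a0"    # rejected; no mastery decision is made
    if y < y_c(x):
        return "a10"   # admitted, classified as nonmaster
    return "a11"       # admitted, classified as master

# Hypothetical cutting scores, chosen for illustration only.
X_C = 45.0

def y_c_weak(x: float) -> float:
    # Weak rule: the mastery cutting score depends on (here: decreases in) x.
    return 60.0 - 0.5 * (x - X_C)

def y_c_strong(x: float) -> float:
    # Strong rule: the mastery cutting score ignores the selection score.
    return 60.0

# Same mastery score, different verdicts under the weak and the strong rule.
weak_action = weak_monotone_rule(55.0, 56.0, X_C, y_c_weak)
strong_action = weak_monotone_rule(55.0, 56.0, X_C, y_c_strong)
```

With these made-up numbers, a student with x = 55 and y = 56 passes under the weak rule but fails under the strong one, which is the kind of compensation the text describes.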

Strong Monotone Rules with Maximum Expected Utility (SMMEU) 

To evaluate the use of cutting scores even if conditions for monotonicity 
are not known to hold, the case of Strong Monotone Rules with Maximum Expected 
Utility (SMMEU rules) is also considered. A SMMEU rule is a rule with maximum 
expected utility in the subclass of strong monotone rules. The attention given to SMMEU 
rules is motivated by the fact that educators are familiar with cutting scores as 
decision rules and do not have a tradition of bothering about their justification. 

Thus, if the sets of conditions for both strong and monotone rules to be 
optimal are satisfied, the subclasses of SMMEU and strong monotone Bayes rules 
are identical. Otherwise, they differ. 

Utility Structure 

Generally, a utility function describes the utility of each possible action for 
the possible true states of nature. Here, the utilities involved in the combined 
decision problem are defined as the following additive structure: 

$$
u_{ij}(t) = w_1 u_i^{(s)}(t) + w_2 u_j^{(m)}(t), \tag{3}
$$

where $u_i^{(s)}(t)$ and $u_j^{(m)}(t)$ represent the utility functions for the separate selection 
and mastery decisions, respectively, and $w_1$ and $w_2$ represent nonnegative weights. 
Since utility is supposed to be measured on an interval scale, the weights of (3) can 
always be rescaled as follows: 

$$
u_{ij}(t) = w u_i^{(s)}(t) + (1-w) u_j^{(m)}(t), \tag{4}
$$

where $0 \leq w \leq 1$. For a rejected student, zero contributions to the utility for the 
separate mastery decision are assumed. Hence, it follows from (4) that $u_{0j}(t)$ is equal 
to $w u_0^{(s)}(t)$ for all j. 

It should be noted that the first term of (3) and (4) is a function of t and 
not, for example, of a true score underlying X. This fact illustrates one of the 
advantages of a simultaneous approach to decision making, namely, that there is no 
need to resort to intermediate criteria of success but that for all decisions utility can 
be defined as a function of the ultimate criterion in the network. 

Below more specific functions $u_i^{(s)}(t)$ and $u_j^{(m)}(t)$ will be adopted. 
Obviously, these functions will be chosen such that utility will be an increasing 
function of t for the admittance and mastery decisions but a decreasing function for 
the rejection and nonmastery decisions. First, however, more general results will be 
presented. 

Expected Utility in the Simultaneous Approach 

For the decision rules in (1) and the utility structure in (4), the expected 
utility for the two decision rules is equal to 

$$
\begin{aligned}
E[U_{sim}(A^c, B^c(x))] ={}& \int_{A} \int_{\mathbb{R}} \int_{\mathbb{R}} w u_0^{(s)}(t)\, f(x,y,t)\,dt\,dy\,dx \\
&+ \int_{A^c} \int_{B(x)} \int_{\mathbb{R}} u_{10}(t)\, f(x,y,t)\,dt\,dy\,dx \\
&+ \int_{A^c} \int_{B^c(x)} \int_{\mathbb{R}} u_{11}(t)\, f(x,y,t)\,dt\,dy\,dx.
\end{aligned}
\tag{5}
$$

In a Bayesian fashion, the expected utility in (5) will be taken as the criterion of 
optimality in this paper. 

Taking expectations, completing integrals, and rearranging terms, (5) can 
be written as 

$$
\begin{aligned}
E[U_{sim}(A^c, B^c(x))] ={}& w E[u_0^{(s)}(T)] + \int_{A^c} \Big\{ E[u_{10}(T) - w u_0^{(s)}(T) \mid x] \\
&+ \int_{B^c(x)} E[u_{11}(T) - u_{10}(T) \mid x,y]\, h(y \mid x)\,dy \Big\}\, q(x)\,dx,
\end{aligned}
\tag{6}
$$

where $q(x)$ and $h(y \mid x)$ denote the p.d.f.'s of X and of Y given X = x, respectively. 

It is interesting to note that the critical quantities in (6) are the posterior 
expected utilities given X=x and (X=x,Y=y). It is through these quantities that 
information from prior tests will play a role in later decisions in the network. 
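
As a rough numerical companion to (5), the expected utility of a monotone rule can be estimated by simulation. The sketch below is not the authors' procedure: the score model in `draw`, the simple linear utilities `u_s` and `u_m`, and the values of `W` and `T_C` are all hypothetical choices made for illustration.

```python
import random

W, T_C = 0.5, 55.0   # hypothetical weight w and mastery standard t_c

def u_s(i, t):
    """Hypothetical selection utility u_i^(s)(t): increasing for i=1, decreasing for i=0."""
    return (t - T_C) if i == 1 else (T_C - t)

def u_m(j, t):
    """Hypothetical mastery utility u_j^(m)(t): increasing for j=1, decreasing for j=0."""
    return (t - T_C) if j == 1 else (T_C - t)

def utility(i, j, t):
    """Additive structure (4); a rejected student contributes w * u_0^(s)(t) only."""
    if i == 0:
        return W * u_s(0, t)
    return W * u_s(1, t) + (1 - W) * u_m(j, t)

def draw(rng):
    """One (x, y, t) from a hypothetical true-score model with normal errors."""
    t = rng.gauss(60.0, 8.0)
    x = t - 10.0 + rng.gauss(0.0, 4.0)
    y = t + rng.gauss(0.0, 4.0)
    return x, y, t

def expected_utility(x_c, y_c, n=20000, seed=1):
    """Monte Carlo estimate of (5) for the rule with cutting scores x_c and y_c(x)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x, y, t = draw(rng)
        i = 1 if x >= x_c else 0
        j = 1 if (i == 1 and y >= y_c(x)) else 0
        total += utility(i, j, t)
    return total / n

def baseline(n=20000, seed=1):
    """Monte Carlo estimate of w * E[u_0^(s)(T)], the first term of (6)."""
    rng = random.Random(seed)
    return sum(W * u_s(0, draw(rng)[2]) for _ in range(n)) / n
```

Rejecting every student (`x_c` set to infinity) reproduces the first term of (6), which gives a simple sanity check on the estimator before it is used to compare candidate cutting scores.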



Sufficient Conditions for Monotone Rules 

In this section, monotonicity conditions for the simultaneous rules are 
derived. First, sufficient and necessary conditions for monotone solutions for the 
separate selection and mastery decisions will be given. Next, sufficient conditions 
for weak monotone solutions will be derived. Finally, monotonicity conditions for 
strong simultaneous rules will be derived from the previous case by imposing 
additional restrictions on the test-score distributions. 

Conditions for Separate Selection and Mastery Decisions 

Conditions necessary and sufficient for selection and mastery rules to be 
(strictly) monotone are given in Chuang, Chen and Novick (1981). Two sets of 
conditions must be met. First, the families of distributions of the true scores T given 
X=x and T given Y=y must be stochastic increasing; that is, their cumulative 
distribution functions (c.d.f.'s) must be decreasing in x and y for all t. Second, the 
utility functions must be monotone. This condition requires the difference between 
the utility function for the rejection (nonmastery) and admittance (mastery) decision 
to change sign at most once. 

Both conditions immediately follow from the standard decision problem 
addressed in statistical decision theory (e.g., Ferguson, 1967; Lindgren, 1976). 

Conditions for Weak Simultaneous Rules 

Let $V(t \mid x,y)$ denote the c.d.f. of T given (X = x, Y = y) and $H(y \mid x)$ the c.d.f. of 
Y given X = x. The following theorem gives a set of conditions sufficient for a 
weak monotone solution: 

Theorem. An optimal simultaneous decision rule for the selection-mastery problem 
is (weak) monotone if: 

    $u_1^{(m)}(t) - u_0^{(m)}(t)$ is strictly increasing in t,             (7) 
    $u_{10}(t) - w u_0^{(s)}(t)$ is strictly increasing in t,              (8) 
    $V(t \mid x,y)$ is strictly decreasing in x and y for all t,           (9) 
    $H(y \mid x)$ is strictly decreasing in x for all y.                   (10) 

The first condition guarantees monotone utility for the mastery decisions. 

The second condition stipulates that the difference between the utility 
functions for the actions $a_{10}$ (acceptance, nonmastery) and $a_0$ (rejection) be an 
increasing function of t. 

The third condition requires double (strict) stochastic increasingness for 
$V(t \mid x,y)$. Loosely speaking, this condition is met if high true scores on the mastery 
test coincide with high observed scores on both the selection and mastery tests. 

The last condition also requires (strict) stochastic increasingness, and thus 
that high scores on the mastery and selection test tend to coincide. 

Not all conditions in this set are straightforward generalizations of the 
conditions for the separate decision problems. In particular, the conditions in (8) and 
(10) are new; they are needed to link the two separate decision problems. 

It should be noted that there is no condition analogous to (7) for the 
selection problem. This is due to the fact that the utility component for this problem 
is defined on the true score variable for the mastery test. 

In the proof of the theorem, the following lemmas are needed: 

Lemma 1: Let f(x) be an arbitrary function with $\int |f(x)|\,dx < \infty$; then for any set S 
of x values it holds that $\int_S f(x)\,dx \leq \int_{S'} f(x)\,dx$ with $S' = \{x : f(x) \geq 0\}$ (e.g., Ferguson, 
1967, p. 201). 

Lemma 2: For any increasing function k(t), the expectation $E[k(T) \mid z]$ is an increasing 
function of z if and only if the c.d.f. of T given Z = z is stochastic increasing (e.g., 
Lehmann, 1959, p. 74). 

Observe that if k(t) is a constant, $E[k(T) \mid z]$ is a constant too. Hence, the 
nondecreasing version of the lemma also holds. 

Lemma 3: If (9) and (10) hold, then the marginal c.d.f. $P(t \mid x)$ associated with 
$V(t \mid x,y)$ is stochastic increasing in x. 

This lemma is proven as follows: Let $v(t \mid x,y)$ be the p.d.f. of T given X = x and 
Y = y. By definition, 

$$
1 - P(t \mid x) = \int_t^{\infty} \int_{\mathbb{R}} v(z \mid x,y)\, h(y \mid x)\,dy\,dz = \int_{\mathbb{R}} [1 - V(t \mid x,y)]\, h(y \mid x)\,dy.
$$

From (9), (10) and Lemma 2, it follows that $1 - P(t \mid x)$ increases in x for all t, i.e., that 
$P(t \mid x)$ is stochastic increasing in x. ■ 

For completeness' sake, it is observed that the c.d.f. of T given Y = y is also 
stochastic increasing if (10) is replaced by the stronger condition of monotone 
likelihood ratio. However, this result is not needed in the remainder of this paper. 

Lemma 4: If a function K(x,y) is (strictly) increasing in x and y, then the relation 
defined by $C = \{(x,y) \mid K(x,y) = c,\ c \in \mathbb{R}\}$ is a decreasing function in x. 

To prove this lemma, assume that there are two pairs $(x_1,y_1) \in C$ and 
$(x_2,y_2) \in C$ with $x_2 > x_1$, for which $y_2 > y_1$. Then, by hypothesis, 
$K(x_2,y_2) \neq K(x_1,y_1)$, which contradicts the assumption. ■ 

Proof of Theorem 

Applying Lemma 1 to the second term in the integral in (6), and using 
$h(y \mid x) \geq 0$, it follows that for all $B^c(x)$ and an arbitrary but fixed $A^c$: 

$$
\begin{aligned}
E[U_{sim}(A^c, B^c(x))] \leq{}& w E[u_0^{(s)}(T)] + \int_{A^c} \Big\{ E[u_{10}(T) - w u_0^{(s)}(T) \mid x] \\
&+ \int_{B_0^c(x)} E[u_{11}(T) - u_{10}(T) \mid x,y]\, h(y \mid x)\,dy \Big\}\, q(x)\,dx,
\end{aligned}
\tag{11}
$$

with 

$$
B_0^c(x) = \{y : E[u_{11}(T) - u_{10}(T) \mid x,y] \geq 0\}. \tag{12}
$$

Again, applying Lemma 1 to the second term in the right-hand side of (11), and 
using $q(x) \geq 0$, it follows that for all $A^c$ 

$$
\begin{aligned}
E[U_{sim}(A^c, B_0^c(x))] \leq{}& w E[u_0^{(s)}(T)] + \int_{A_0^c} \Big\{ E[u_{10}(T) - w u_0^{(s)}(T) \mid x] \\
&+ \int_{B_0^c(x)} E[u_{11}(T) - u_{10}(T) \mid x,y]\, h(y \mid x)\,dy \Big\}\, q(x)\,dx,
\end{aligned}
\tag{13}
$$

with 

$$
A_0^c = \Big\{ x : E[u_{10}(T) - w u_0^{(s)}(T) \mid x] + \int_{B_0^c(x)} E[u_{11}(T) - u_{10}(T) \mid x,y]\, h(y \mid x)\,dy \geq 0 \Big\}. \tag{14}
$$

It is now proven that the left-hand sides of the inequalities in (12) and (14) 
increase in y for all x and in x for all y, respectively. If these features hold, then (6) 
is maximal for the sets $A_0^c = [x_c, \infty)$ and $B_0^c(x) = [y_c(x), \infty)$, where $y_c(x)$ and $x_c$ are the 
values of y and x for which the inequalities in (12) and (14) become equalities. (The 
numbers $x_c$ and $y_c(x)$ may be infinitely small or large, implying that the same 
decisions have to be made for all examinees.) 

(i) Since $u_{11}(t) - u_{10}(t) = (1-w)[u_1^{(m)}(t) - u_0^{(m)}(t)]$ and $1 - w > 0$, it follows from 
the condition in (7) that the difference between these two utilities is increasing too. 
Therefore, (9) and Lemma 2 together imply that 

$$
E[u_{11}(T) - u_{10}(T) \mid x,y] \text{ is increasing in } y \text{ for all } x \text{ and in } x \text{ for all } y, \tag{15}
$$

and thus that the sets $B_0^c(x)$ take the required form $[y_c(x), \infty)$ for all values of x. 
This result will be used in the following part of the proof. 

(ii) From (8)-(10), Lemma 2 and Lemma 3, it follows immediately that the first 
term in the left-hand side of (14) is increasing in x. 

For notational convenience, the term $E[u_{11}(T) - u_{10}(T) \mid x,y]$ is denoted as $\tau(x,y)$. 
Note that $\tau(x,y)$ is an increasing function of y which is nonnegative for $y \geq y_c(x)$ 
for all values of x. Now for any $x_2 > x_1$, it follows from Lemma 4 that 

$$
\begin{aligned}
& \int_{y_c(x_2)}^{\infty} \tau(x_2,y)\, h(y \mid x_2)\,dy - \int_{y_c(x_1)}^{\infty} \tau(x_1,y)\, h(y \mid x_1)\,dy \\
&\quad \geq \int_{y_c(x_1)}^{\infty} \tau(x_2,y)\, h(y \mid x_2)\,dy - \int_{y_c(x_1)}^{\infty} \tau(x_1,y)\, h(y \mid x_1)\,dy \\
&\quad \geq \int_{y_c(x_1)}^{\infty} \tau(x_1,y)\,[h(y \mid x_2) - h(y \mid x_1)]\,dy \\
&\quad = \int_{\mathbb{R}} \varphi(y)\,[h(y \mid x_2) - h(y \mid x_1)]\,dy > 0,
\end{aligned}
\tag{16}
$$

where $\varphi(y) = 1_{[y_c(x_1),\infty)}(y)\,\tau(x_1,y)$. By definition, $\varphi(y)$ is a nondecreasing function of 
y, and it follows from (10) and Lemma 2 that (16) is positive. Hence, it can be 
concluded that the second left-hand term in (14) is increasing in x, and thus that the 
set $A_0^c$ takes the required form $[x_c, \infty)$. ■ 

Monotonicity Conditions for Strong Simultaneous Rules 

For strong simultaneous rules, $B_0^c(x)$ is not allowed to depend on x. 
Therefore, as an additional condition, it must hold for $v(t \mid x,y)$ and the p.d.f. $g(t \mid y)$ of T 
given Y = y that 

$$
v(t \mid x,y) = g(t \mid y). \tag{17}
$$

This condition, which immediately follows from (12), implies that all information 
on T relevant for the decision is contained in Y=y, and that, once Y=y is given, the 
observation X=x does not add any information. If the condition holds, then, 
obviously, the use of simultaneous rules will not add any efficiency to the decision 
making procedure. 



Calculation of Simultaneous Rules 

From the theorem it follows that the optimal weak and strong simultaneous 
rules can be calculated as the points at which the inequalities in (12) and (14) turn 
into equalities. For the weak rules, it should be noted that the sets $B_0^c(x)$ and $A_0^c$ 
are sequentially defined. First, for all values of x, the sets $B_0^c(x)$ are defined by 
(12). Only then is the set $A_0^c$ defined by (14). Hence, optimal weak rules have to 
be calculated in this sequence. 
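
The two-step sequence can be sketched as follows. This is a minimal illustration under strong assumptions that are not taken from the paper's data: linear utilities of the form the paper adopts in its empirical example, a linear regression plane for E[T | x,y], a normal conditional distribution of Y given x, and made-up parameter values. Bisection stands in for whatever root finder is preferred.

```python
from statistics import NormalDist

# Hypothetical utility parameters (linear utilities) and mastery standard.
W, T_C = 0.5, 55.0
B0S = B1S = B0M = B1M = 1.0
D0S = D1S = D0M = D1M = 0.0
# Hypothetical model: E[T|x,y] = ALPHA + BETA*x + GAMMA*y, Y|x ~ N(M0 + M1*x, SD^2).
ALPHA, BETA, GAMMA = 10.0, 0.27, 0.40
M0, M1, SD = 25.0, 0.8, 6.0

STD = NormalDist()

def y_c(x):
    """Step 1: solve (12) as an equality for y, given x."""
    target = T_C + (D0M - D1M) / (B0M + B1M)   # required value of E[T|x,y]
    return (target - ALPHA - BETA * x) / GAMMA

def lhs_14(x):
    """Left-hand side of (14) at x, using y_c(x) from step 1."""
    mu_y = M0 + M1 * x
    t_x = ALPHA + BETA * x + GAMMA * mu_y       # E[T|x]
    # E[u_10(T) - w*u_0^(s)(T) | x]; the utilities are linear, so E[T|x] can be plugged in.
    first = (W * (B1S * (t_x - T_C) + D1S)
             + (1 - W) * (B0M * (T_C - t_x) + D0M)
             - W * (B0S * (T_C - t_x) + D0S))
    # Here tau(x, y) = k * (y - y_c(x)); integrate it over y >= y_c(x) under N(mu_y, SD^2).
    k = (1 - W) * (B0M + B1M) * GAMMA
    z = (y_c(x) - mu_y) / SD
    second = k * SD * (STD.pdf(z) - z * (1 - STD.cdf(z)))
    return first + second

def bisect(f, lo, hi, tol=1e-8):
    """Root of an increasing function f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Step 2: only after y_c(x) is available can (14) be solved for x_c.
x_c = bisect(lhs_14, 0.0, 100.0)
```

Note that `lhs_14` calls `y_c` internally, which mirrors the point made in the text: the selection cutting score cannot be computed before the mastery cutting scores are known for every x.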



SMMEU rules can be calculated by solving the system of equations consisting 
of the partial derivatives of (6) w.r.t. $x_c$ and $y_c$ equated to zero. 

In the empirical example below, for the calculation of all cutting scores 
Newton's method for solving nonlinear systems was used. The method was 
implemented in a computer program called NEWTON. Another program, UTILITY, 
was written to analyze differences in expected utility for the various rules. Copies 
of the programs are available from the authors of the paper upon request. 
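
For readers without access to those programs, a generic Newton iteration for a system of two nonlinear equations might look like the sketch below. It is not the authors' NEWTON program: the function name `newton_system`, the forward-difference Jacobian, and the toy system used to exercise it are all assumptions made for illustration.

```python
def newton_system(f, start, tol=1e-10, max_iter=50, h=1e-6):
    """Solve f(x1, x2) = (0, 0) by Newton's method with a forward-difference Jacobian."""
    x1, x2 = start
    for _ in range(max_iter):
        f1, f2 = f(x1, x2)
        if abs(f1) + abs(f2) < tol:
            break
        # Approximate the Jacobian column by column.
        a1, a2 = f(x1 + h, x2)
        b1, b2 = f(x1, x2 + h)
        j11, j21 = (a1 - f1) / h, (a2 - f2) / h
        j12, j22 = (b1 - f1) / h, (b2 - f2) / h
        det = j11 * j22 - j12 * j21
        if det == 0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * step = -f with Cramer's rule.
        s1 = (-f1 * j22 + f2 * j12) / det
        s2 = (-j11 * f2 + j21 * f1) / det
        x1, x2 = x1 + s1, x2 + s2
    return x1, x2

def toy_system(x, y):
    """Hypothetical test system: x^2 + y^2 = 4 and x = y."""
    return x * x + y * y - 4.0, x - y

root = newton_system(toy_system, (1.0, 2.0))
```

In the SMMEU setting, `f` would return the two partial derivatives of (6) with respect to the cutting scores; those derivatives depend on the fitted score distribution and are not reproduced here.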



Optimal Separate Rules 

It is observed that optimal rules for the separate decisions can easily be 
found by imposing certain restrictions on $E[U_{sim}(A^c, B^c(x))]$. 

First, substituting w = 1 into (6), the expected utility for the separate 
selection decision, $E[U^{(s)}(A^c)]$, can be written as 

$$
E[U^{(s)}(A^c)] = E[u_0^{(s)}(T)] + \int_{A^c} E[u_1^{(s)}(T) - u_0^{(s)}(T) \mid x]\, q(x)\,dx. \tag{18}
$$

Next, substituting w = 0, $A^c = \mathbb{R}$ (i.e., accepting all students for the 
instructional treatment), and $B^c(x) = B^c$ into (6) gives the following result for the 
expected utility of the separate mastery decision: 

$$
E[U^{(m)}(B^c)] = E[u_0^{(m)}(T)] + \int_{B^c} E[u_1^{(m)}(T) - u_0^{(m)}(T) \mid y]\, s(y)\,dy, \tag{19}
$$

where s(y) denotes the p.d.f. of Y. 

Analogous to the simultaneous approach, it can easily be verified that upper 
bounds to $E[U^{(s)}(A^c)]$ and $E[U^{(m)}(B^c)]$ are obtained for the sets of x and y values 
for which $E[u_1^{(s)}(T) - u_0^{(s)}(T) \mid x]$ and $E[u_1^{(m)}(T) - u_0^{(m)}(T) \mid y]$ are nonnegative, 
respectively. Assuming that the monotonicity conditions for the separate decisions 
are satisfied, the optimal cutting scores for the separate selection and mastery 
decisions, say $x_c$ and $y_c$, can be obtained by solving $E[u_1^{(s)}(T) - u_0^{(s)}(T) \mid x] = 0$ and 
$E[u_1^{(m)}(T) - u_0^{(m)}(T) \mid y] = 0$ for $x_c$ and $y_c$, respectively. For further details, see 
Mellenbergh and van der Linden (1981) and van der Linden and Mellenbergh 
(1977). 
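
When the utilities are linear in t, as in the paper's empirical example, and the regression of T on the test score is linear, these equations have a closed-form solution. The helper below is a hedged sketch of that special case; the name `separate_cutting_score` and all numbers passed to it are hypothetical.

```python
def separate_cutting_score(b0, b1, d0, d1, t_c, intercept, slope):
    """Solve (b0 + b1) * (E[T|z] - t_c) + d1 - d0 = 0 for z, with E[T|z] = intercept + slope * z.

    b0, d0 belong to the rejection (nonmastery) utility and b1, d1 to the
    acceptance (mastery) utility; z is the selection or the mastery test score.
    """
    target = t_c + (d0 - d1) / (b0 + b1)   # value E[T|z] must reach at the cutting score
    return (target - intercept) / slope

# Hypothetical regression E[T|x] = 20 + 0.59 * x and utility parameters.
x_c_separate = separate_cutting_score(1.0, 1.0, 0.0, 0.0, 55.0, 20.0, 0.59)
```

The same function serves the mastery decision once the regression of T on y is supplied instead.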

An Empirical Example 

Optimal rules were calculated for a selection-mastery decision problem 
consisting of a CAI module on elementary medical knowledge preceded and 
followed by a selection and mastery test, respectively. Both tests consisted of 21 
items and had possible test scores ranging from 0-100. Data were available for a 
sample of 76 freshmen in a medical program. The instructors in the program 
considered students as having mastered the module if their true scores were larger 
than 55. Therefore, $t_c$ was fixed at this value. 

Score Distributions 

It was assumed that (X,Y,T) followed a trivariate normal distribution. Under 
this assumption, the bivariate distribution of (X,Y) is also normal. Further, the 
regression function $E[Y \mid x]$ is linear. 
These two observable consequences were tested against the data using a 
chi-square and a t-test. The probabilities of exceedance were 0.219 and 0.034, 
showing a satisfactory fit which confirmed our visual inspection of various plots of 
the distributions. 

Some descriptive statistics for the two tests are given in Table 1. 



Table 1 about here 



Utility Structure 

The following choice was made for the functions $u_i^{(s)}(t)$ and $u_j^{(m)}(t)$ in 
(4): 

$$
u_i^{(s)}(t) =
\begin{cases}
b_0^{(s)}(t_c - t) + d_0^{(s)} & \text{for } i = 0 \\
b_1^{(s)}(t - t_c) + d_1^{(s)} & \text{for } i = 1
\end{cases}
\tag{20}
$$

$$
u_j^{(m)}(t) =
\begin{cases}
b_0^{(m)}(t_c - t) + d_0^{(m)} & \text{for } j = 0 \\
b_1^{(m)}(t - t_c) + d_1^{(m)} & \text{for } j = 1
\end{cases}
\tag{21}
$$

where $b_i^{(s)}, b_j^{(m)} > 0$ ($i,j = 0,1$). The parameters $d_i^{(s)}$ and $d_j^{(m)}$ can represent, for 
example, the fixed amount of costs involved in following an instructional module 
and testing the examinees. The condition $b_0^{(s)}, b_1^{(s)} > 0$ states that utility be a 
decreasing function for the rejection decision, but an increasing function for the 
acceptance decision. Similarly, the condition $b_0^{(m)}, b_1^{(m)} > 0$ expresses that the 
utilities associated with failing and passing the mastery test be decreasing and 
increasing functions in t, respectively. 

The same utility functions were used in an analysis of separate selection 
and mastery decisions in Mellenbergh and van der Linden (1981) and van der 
Linden and Mellenbergh (1977). For other possible utility functions, see Novick and 
Lindley (1979). 
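
A compact way to experiment with this utility structure is to build the linear utilities and their weighted combination as small Python functions. The sketch below is illustrative only: the helper names and every parameter value are hypothetical and do not correspond to the utility structures used in the paper's Table 2.

```python
def linear_utility(b0, b1, d0, d1, t_c):
    """Return u(k, t) of the linear form in (20)/(21): decreasing in t for k=0, increasing for k=1."""
    def u(k, t):
        if k == 1:
            return b1 * (t - t_c) + d1
        return b0 * (t_c - t) + d0
    return u

def combined_utility(u_s, u_m, w):
    """Additive structure (4); a rejected student (i=0) contributes w * u_0^(s)(t) only."""
    def u(i, j, t):
        if i == 0:
            return w * u_s(0, t)
        return w * u_s(1, t) + (1 - w) * u_m(j, t)
    return u

# Hypothetical parameter values.
u_sel = linear_utility(b0=1.0, b1=2.0, d0=0.0, d1=-1.0, t_c=55.0)
u_mas = linear_utility(b0=1.0, b1=1.0, d0=0.0, d1=0.0, t_c=55.0)
u_sim = combined_utility(u_sel, u_mas, w=0.5)

# Utility of admitting and passing a student with true score 60.
value = u_sim(1, 1, 60.0)
```

Here `value` equals 0.5 * (2 * 5 - 1) + 0.5 * 5 = 7.0, and the utility of rejecting a student does not depend on the mastery index j.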



Monotonicity Conditions 

The condition in (7) is met since $b_j^{(m)} > 0$, j = 0,1. 

It can easily be verified that the condition in (8) is satisfied if the weight 
w and the parameters $b_0^{(s)}$, $b_1^{(s)}$, and $b_0^{(m)}$ are chosen such that 

$$
w > \frac{b_0^{(m)}}{b_0^{(s)} + b_1^{(s)} + b_0^{(m)}}. \tag{22}
$$



All numerical values for the utility parameters in the example were chosen to meet 
these two requirements. 
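
The step from (8) to (22) is short. Spelled out for the linear utilities in (20) and (21), a worked version (added here for convenience, not taken from the original text) reads:

```latex
\begin{align*}
u_{10}(t) - w\,u_0^{(s)}(t)
  &= w\,u_1^{(s)}(t) + (1-w)\,u_0^{(m)}(t) - w\,u_0^{(s)}(t), \\
\frac{d}{dt}\Bigl[u_{10}(t) - w\,u_0^{(s)}(t)\Bigr]
  &= w\,b_1^{(s)} - (1-w)\,b_0^{(m)} + w\,b_0^{(s)}
   = w\bigl(b_0^{(s)} + b_1^{(s)} + b_0^{(m)}\bigr) - b_0^{(m)}, \\
\text{which is positive}
  &\iff w > \frac{b_0^{(m)}}{b_0^{(s)} + b_1^{(s)} + b_0^{(m)}}.
\end{align*}
```

The constants $d_i^{(s)}$ and $d_j^{(m)}$ drop out of the derivative, so they play no role in this condition.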

Under the model of a trivariate normal distribution for (X,Y,T) in this 
example, the conditions in (9)-(10) were met by the positive slopes of the regression 
lines and planes in this distribution. 

Finally, the additional condition for solutions to be strong monotone in (17) 
was tested by comparing the two regression lines $E[T \mid x,y]$ and $E[T \mid y]$ using an F-test. 
The probability of exceedance was 0.038, indicating that the result was just 
significant for $\alpha = .05$. Therefore, only SMMEU rules and no optimal strong rules 
were considered. 

Results for the Simultaneous Rules 

For several values of the utility parameters, weak monotone and SMMEU 
rules were calculated. The results are reported in Table 2, where the cutting scores 
for the SMMEU rules are denoted as $x_c^*$ and $y_c^*$. 



Table 2 about here 

As is clear from the results, the consequences of increasing the values of 
the parameters $b_i^{(s)}$ and $b_j^{(m)}$ were decreases of the optimal weak and SMMEU 
cutting scores on the selection test. On the other hand, a decrease of the amount of 
constant utility, $d_i^{(s)}$ and $d_j^{(m)}$, resulted in increases of the optimal weak and 
SMMEU cutting scores on the selection test. Furthermore, Table 2 indicates that the 
optimal weak and SMMEU cutting scores on the selection test increase in w for 
utility structures (1)-(3) and (4)-(6) in Table 2, whereas the opposite holds for utility 
structures (7)-(9) in the table. 

Results for the Separate Approach 

The optimal cutting scores $x_c$ and $y_c$ for the separate selection and mastery 
decisions are also reported in Table 2. In particular for w = 0.3, the weak cutting 
scores $y_c(x_c)$ on the mastery test generally were high compared with $y_c$. 

The results did not differ much from those obtained for the weak monotone 
rules. This fact can be explained as follows: Students who were just accepted in the 
case of a weak monotone rule had to compensate for their rather low scores on 
the selection test with relatively high scores on the mastery test compared with 
students accepted in the case of separate rules. However, the decreasing character 
of $y_c(x)$ in x implied that only students accepted with selection scores equal to or 
just above $x_c$ needed these rather high scores on the mastery test to reach the 
mastery status. 

Comparison of the Expected Utilities 

For the simultaneous approach a gain in expected utility relative to the 
separate approach was expected. To see whether this expectation could be 
confirmed, the weighted sum of the expected utilities for the optimal separate rules 
was compared with the expected utilities for the optimal weak monotone rules. The 
results are also displayed in Table 2. 

It can be seen that the expected utilities for the optimal weak monotone 
rules yielded the largest values for all utility structures. This result was in 
accordance with our expectations. Furthermore, Table 2 indicates that the expected 
utilities for the optimal weak monotone rules were only slightly larger than for the 
SMMEU rules. Finally, the table shows that for all three approaches, the expected 
utility yielded the largest value for w = 0.9. In other words, the utility for the 
selection decision contributed most to the expected utility for the optimal 
simultaneous rules in this study. 

Concluding Remarks 

For a monotone utility structure, Lemma 4 shows that under the natural 
condition of the selection and mastery test scores being stochastic increasing in the 
true score on the mastery test, weak cutting scores for mastery decisions are a 
decreasing function of the scores on the selection test. As already explained, this 
feature introduces an element of compensation in the decision procedure: It is 
possible to compensate low scores on the mastery test by high scores on the 
selection test. A quantitative estimate of this effect can be calculated for the data set 
in the empirical example above. Substituting the estimated regression plane 
$E[T \mid x,y] = \alpha + \beta x + \gamma y$ into the left-hand side in (12) and solving for $y_c(x)$ yields 

$$
y_c(x) = \left[ \frac{d_0^{(m)} - d_1^{(m)}}{b_0^{(m)} + b_1^{(m)}} + t_c - \alpha - \beta x \right] \Big/ \gamma.
$$

The derivative of this equation w.r.t. x is equal to $-\beta/\gamma$, which for the data set was 
estimated as -.675. It follows for all utility structures in this example that the cutting 
score $y_c(x)$ on the mastery test has to be lowered by .675 for each score point above 
$x_c$ on the selection test. 
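
For completeness, the intermediate steps behind the expression for the weak cutting score can be written out as follows. This worked derivation is added for convenience and assumes the linear mastery utilities of (21) and a linear regression plane for the true score:

```latex
\begin{align*}
E[u_{11}(T) - u_{10}(T) \mid x, y]
  &= (1-w)\,E\bigl[u_1^{(m)}(T) - u_0^{(m)}(T) \mid x, y\bigr] \\
  &= (1-w)\Bigl[\bigl(b_0^{(m)} + b_1^{(m)}\bigr)\bigl(\alpha + \beta x + \gamma y - t_c\bigr)
     + d_1^{(m)} - d_0^{(m)}\Bigr]. \\
\intertext{Setting this expression equal to zero and solving for $y$ gives}
y_c(x) &= \frac{1}{\gamma}\left[\frac{d_0^{(m)} - d_1^{(m)}}{b_0^{(m)} + b_1^{(m)}}
          + t_c - \alpha - \beta x\right],
\qquad
\frac{d\,y_c(x)}{dx} = -\frac{\beta}{\gamma}.
\end{align*}
```

The weight w and the utility parameters shift the level of $y_c(x)$ but not its slope, which is why the same rate of compensation applies to every utility structure in the example.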

Although the area of individualized instruction is a useful application of 
simultaneous decision making, it should be emphasized that the optimization models 
advocated in this paper have a larger scope of application. For any situation in 
which subjects are accepted for a certain treatment on the basis of their scores on 
a selection test with attainments evaluated by a mastery test, the optimal rules 
presented in this paper can improve the decisions. An example is psychotherapy 
where clients accepted have to pass a success criterion before being dismissed from 
the therapy. 

Table 1 

Statistics Selection and Mastery Tests (X and Y) 

Statistics                 X         Y 
------------------------------------------ 
Mean                    50.679    62.436 
Standard Deviation       8.781     9.456 
Reliability              0.773     0.802 
Correlation                   0.751 

Figure Caption 

Figure 1. A system of one selection and one mastery decision. 